Abstract. Consider an undirected weighted graph $G = (V, E)$ with $|V| = n$ and $|E| = m$, where each
vertex $v \in V$ is assigned a label from a set of labels $L = \{\lambda_1, \ldots, \lambda_\ell\}$. We show how to construct a
compact distance oracle that can answer queries of the form: "what is the distance from $v$ to the closest
$\lambda$-labeled node" for a given node $v \in V$ and label $\lambda \in L$.

This problem was introduced by Hermelin, Levy, Weimann and Yuster [ICALP 2011], who present
several results for it. In the first result, they show how to construct a vertex-label distance
oracle of expected size $O(kn^{1+1/k})$ with stretch $(4k - 5)$ and query time $O(k)$. In a second result, they
show how to reduce the size of the data structure to $O(kn\ell^{1/k})$ at the expense of a huge stretch: the
stretch of this construction grows exponentially in $k$, namely $(2^k - 1)$. In the third result they present a dynamic
vertex-label distance oracle that is capable of handling label changes in sub-linear time. The stretch
of this construction is also exponential in $k$, namely $(2 \cdot 3^{k-1} + 1)$.

We manage to significantly improve the stretch of their constructions, reducing the dependence on $k$
from exponential to polynomial, $(4k - 5)$, without requiring any tradeoff regarding any of the other
variables.

In addition, we introduce the notion of vertex-label spanners: subgraphs that preserve distances between
every node $v \in V$ and label $\lambda \in L$. We present an efficient construction for vertex-label spanners with
stretch-size tradeoff close to optimal.

1 Introduction 

An approximate distance oracle for a given graph $G = (V, E)$ is a processed data structure that, given two
nodes $s$ and $t$, can quickly return an approximation of $\mathrm{dist}(s, t, G)$, the distance between $s$ and $t$ in $G$. [To
ease notation, we let $\mathrm{dist}(s, t) = \mathrm{dist}(s, t, G)$. In other words, when we refer to the distance between $s$ and
$t$ in some subgraph $H$ of $G$, we will always state the subgraph explicitly and write $\mathrm{dist}(s, t, H)$. Otherwise,
if we write $\mathrm{dist}(s, t)$ we mean $\mathrm{dist}(s, t, G)$.]

The approximate distance oracle is said to be of stretch $k$, or a $k$-approximate distance oracle, if for
every two nodes $s$ and $t$, the reported distance $\widetilde{\mathrm{dist}}(s, t)$ between $s$ and $t$ satisfies
$\mathrm{dist}(s, t) \le \widetilde{\mathrm{dist}}(s, t) \le k \cdot \mathrm{dist}(s, t)$.

Usually, the key concerns in designing approximate distance oracles are to minimize the size of the data 
structure, to minimize the stretch, and to minimize the query time. 

Distance oracles have been extensively studied. They were first introduced by Thorup and Zwick in a
seminal paper [17]. Thorup and Zwick showed how to construct, for a given integer $k \ge 1$, a $(2k - 1)$-approximate
distance oracle of size $O(kn^{1+1/k})$ that can answer distance queries in $O(k)$ time. Thorup and
Zwick [17] showed that their space requirements are essentially optimal assuming the girth conjecture of
Erdős [3]. Thorup and Zwick also showed how to derandomize this construction, but at the cost of increasing
the preprocessing time and slightly increasing the size. Roditty, Thorup, and Zwick [14] later improved
this result, presenting a faster deterministic construction and reducing the size of the data structure to
$O(kn^{1+1/k})$ as in the randomized construction. Further improvements on the construction time were later
introduced in [4, 5, 3]. For further results and lower bounds see also [9, 15, 10, 1].

In this paper, we consider a natural variant of the approximate distance oracle problem for vertex-labeled
graphs. We are given an undirected weighted graph $G = (V, E)$, where each vertex $v$ is assigned a label
$\lambda(v)$, where $\lambda(v)$ belongs to a set $L = \{\lambda_1, \ldots, \lambda_\ell\}$ of $\ell \le n$ distinct labels. The goal is to construct a compact
data structure that, given a node $v$ and a label $\lambda \in L$, can quickly return an approximation to the distance
$\mathrm{dist}(v, \lambda)$, where $\mathrm{dist}(v, \lambda)$ is the minimal distance between $v$ and a $\lambda$-labeled node in $G$. This interesting
variant of distance oracles was introduced by Hermelin, Levy, Weimann and Yuster [8]. The labels of the
nodes often represent some functionality (or resources). In some settings, the natural question is not what is
the distance between two given nodes, but rather what is the distance between a given node and some desired
resource. For example, the nodes may represent cities and the labels may represent some public resources
such as hospitals, courts, universities, and so on.

Hermelin et al. [8] mention that there is a simple solution for this problem: store a table of size $n \cdot \ell$,
where the entry $(v, \lambda)$ represents the distance $\mathrm{dist}(v, \lambda)$. This data structure is of size $O(n\ell)$, the query time
is $O(1)$, and the stretch is 1 (exact distances). As shown in [8], this table can be constructed in $O(m\ell)$ time.
However, as is also mentioned in [8], this data structure suffers from two main drawbacks. First, in many
applications, $O(n\ell)$ might still be too large, and in such applications it might be preferable to store a more
compact data structure at the price of approximate distances. Second, in some settings it might be desirable
to allow label changes, and it is not clear how to efficiently handle label changes using the above-mentioned
data structure.
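As an illustration of the simple exact solution, the following minimal Python sketch builds the $n \cdot \ell$ table by running one multi-source Dijkstra per label. The function name `label_distance_table` and the adjacency-list layout are assumptions made for this example; they do not come from [8].

```python
import heapq
from collections import defaultdict


def label_distance_table(adj, label):
    """Exact table of dist(v, lam) for every reachable pair.

    adj:   {u: [(v, weight), ...]} for an undirected graph (hypothetical layout)
    label: {u: label of u}
    Returns {(v, lam): distance from v to the closest lam-labeled node}.
    """
    by_label = defaultdict(list)
    for v, lam in label.items():
        by_label[lam].append(v)

    table = {}
    for lam, sources in by_label.items():
        # Multi-source Dijkstra: every lam-labeled node starts at distance 0.
        dist = {s: 0 for s in sources}
        heap = [(0, s) for s in sources]
        heapq.heapify(heap)
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            for v, w in adj.get(u, []):
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        for v, d in dist.items():
            table[(v, lam)] = d
    return table
```

With a binary heap this costs $O(m \log n)$ per label rather than the $O(m\ell)$ total quoted above, so it only illustrates the idea, not the stated bound.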

Hermelin et al. [8] present several results for the problem of distance oracles for vertex-labeled graphs. In their
first result, Hermelin et al. show how to construct a vertex-label distance oracle of expected size $O(kn^{1+1/k})$
with stretch $(4k - 5)$ and query time $O(k)$. This result is unsatisfactory when $\ell$ is very small, especially when
$\ell = o(n^{1/k})$. In this case, the trivial $O(n\ell)$ solution gives a smaller data structure with exact distances. To
overcome this issue, they propose a second data structure of size $O(kn\ell^{1/k})$. This, however, comes at the
price of a huge stretch factor of $2^k - 1$. In a third result, they present a dynamic vertex-label distance oracle
that is capable of handling label changes in sub-linear time. More specifically, they show how to construct
a vertex-label distance oracle of expected size $\tilde{O}(kn^{1+1/k})$ and with stretch $(2 \cdot 3^{k-1} + 1)$ that can support
label changes in $\tilde{O}(kn^{1/k} \log\log n)$ time and queries in $O(k)$ time.

Note that in the latter two results, the stretch depends exponentially on $k$. In this paper, we address an
important question they left open, namely, whether it is possible to improve this dependence on $k$ from exponential
to polynomial. More specifically, we prove the following theorems.

Theorem 1. A vertex-label distance oracle of expected size $O(kn\ell^{1/k})$ with stretch $(4k - 5)$ and query time
$O(k)$ can be constructed in $O(m \cdot \min\{n^{k/(2k-1)}, \ell\})$ time.

Theorem 2. A vertex-label distance oracle of expected size $\tilde{O}(n^{1+1/k})$ with stretch $(4k - 5)$ and query time
$O(k)$ can be constructed in $O(kmn^{1/k})$ time and can support label changes in $O(n^{1/k} \log^{1-1/k} n \log\log n)$
time.

A notion closely related to distance oracles is that of spanners. A subgraph $H$ is said to be a $k$-spanner
(or a spanner with stretch $k$) of the graph $G$ if $\mathrm{dist}(u, v, H) \le k \cdot \mathrm{dist}(u, v, G)$ for every $u, v \in V(G)$.
Here and throughout, $V(G')$ denotes the set of vertices of graph $G'$, and similarly, $E(G')$ denotes the set
of edges of graph $G'$. A well-known theorem on spanners is that one can efficiently construct a $(2k - 1)$-spanner
with $O(n^{1+1/k})$ edges. This size-stretch tradeoff is conjectured to be optimal. The notion of
spanners was introduced in the late 80's [11, 12], and has been extensively studied. Spanners are used as a
fundamental ingredient in many distributed applications (e.g., synchronizers [12], compact routing [13, 16],
broadcasting [7], etc.).

This paper also introduces a natural extension of spanners, spanners for vertex-labeled graphs, and
presents efficient constructions for such spanners. Consider an undirected weighted graph $G = (V, E)$, where
each vertex $v \in V$ is assigned a label from a set of labels $L = \{\lambda_1, \ldots, \lambda_\ell\}$. We say that a subgraph $H$ is a
vertex-labeled $k$-spanner (VL $k$-spanner) of $G$ if $\mathrm{dist}(u, \lambda, H) \le k \cdot \mathrm{dist}(u, \lambda, G)$ for every node $u \in V$ and
label $\lambda \in L$. It is not hard to verify that every $k$-spanner is also a VL $k$-spanner. However, one may hope to
find sparser VL spanners when the number of labels is small. A naive approach would be to create for each
label $\lambda$ an auxiliary graph $G_\lambda$ by adding a new node $s_\lambda$ and then connecting $s_\lambda$ to all $\lambda$-labeled nodes with
edges of weight 0. It is not hard to verify that by invoking a shortest path algorithm in every $G_\lambda$ from $s_\lambda$
and taking the union of all these shortest-path trees (removing the nodes $s_\lambda$ and their incident edges), the
resulting subgraph is a VL 1-spanner (preserving the exact distances) with $O(n\ell)$ edges. However, the $O(n\ell)$
size of this spanner may still be too large in many settings, and one may wish to reduce the size of the spanner
at the price of approximate distances. Ideally, one would wish to find a VL $(2k - 1)$-spanner with $O(n\ell^{1/k})$
edges (beating these bounds yields an improved trade-off for standard spanners). We managed
to come close to this goal, presenting an efficient construction for VL spanners with stretch close to $(4k + 1)$
and with $\tilde{O}(n\ell^{1/k})$ edges. More specifically, we prove the following theorem.

Theorem 3. For every weighted graph $G$ with minimal edge weight 1 and fixed parameter $\epsilon > 0$, one can
efficiently construct a vertex-label $(4k + 1)(1 + \epsilon)$-spanner with $O(\log n \cdot \log D \cdot n\ell^{1/k})$ edges, where $D$ is the
diameter of the graph.
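The naive exact VL 1-spanner described before Theorem 3 can be sketched as follows. Instead of materializing the auxiliary node $s_\lambda$, this sketch starts Dijkstra from all $\lambda$-labeled nodes at distance 0, which is equivalent to attaching them to $s_\lambda$ with weight-0 edges. The name `exact_vl_spanner` and the input layout are assumptions for illustration only.

```python
import heapq


def exact_vl_spanner(adj, label):
    """Union of one shortest-path forest per label.

    adj:   {u: [(v, weight), ...]}, undirected, positive weights (hypothetical layout)
    label: {u: label of u}
    Returns a set of frozenset({u, v}) edges preserving every dist(v, lam).
    """
    edges = set()
    for lam in set(label.values()):
        # All lam-labeled nodes act as sources at distance 0.
        dist = {v: 0 for v in adj if label[v] == lam}
        parent = {}
        heap = [(0, v) for v in dist]
        heapq.heapify(heap)
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    parent[v] = u
                    heapq.heappush(heap, (d + w, v))
        # Each non-source node contributes at most one tree edge per label.
        edges.update(frozenset((v, p)) for v, p in parent.items())
    return edges
```

Each label contributes at most $n$ edges, which matches the $O(n\ell)$ bound stated above.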

We note that our constructions for vertex-labeled distance oracles and the constructions presented in [8]
do not seem to transform well to also give spanners. Therefore, our vertex-labeled spanner constructions use
different techniques and require some new ideas (this is the technically more involved part of this paper,
Section 4).

The rest of the paper is organized as follows. In Section 2 we prove Theorem 1. In Section 3 we prove
Theorem 2. In Section 4 we prove Theorem 3; for simplicity, we first present a construction for unweighted
graphs in Subsection 4.1 and then show how to generalize it to weighted graphs in Subsection 4.2.

2 Compact Vertex-Label Distance Oracles 

In this section we prove Theorem 1. In Subsection 2.1 we present the construction of our data structure, in
Subsection 2.2 we present our query answering algorithm, and in Subsection 2.3 we analyze the construction
time.

2.1 The Data Structure 

The first step of the construction of the data structure is similar to the algorithm presented in [8]. For a given
positive integer $k$, construct the sets $V = A_0 \supseteq A_1 \supseteq \cdots \supseteq A_{k-1}$ as follows: the $i$-th level $A_i$ is constructed
by sampling the vertices of $A_{i-1}$ independently at random with probability $\ell^{-1/k}$ for $1 \le i \le k - 1$.
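The sampling step can be sketched in a few lines of Python. The function name `sample_levels` and the optional `rng` parameter are assumptions introduced for this example.

```python
import random


def sample_levels(vertices, num_labels, k, rng=None):
    """Return [A_0, A_1, ..., A_{k-1}] with A_0 = V.

    Each A_i keeps every vertex of A_{i-1} independently with
    probability num_labels ** (-1/k).
    """
    rng = rng or random.Random()
    p = num_labels ** (-1.0 / k)
    levels = [list(vertices)]
    for _ in range(1, k):
        prev = levels[-1]
        levels.append([v for v in prev if rng.random() < p])
    return levels
```

A vertex survives to level $i$ with probability $\ell^{-i/k}$, so the expected size of the top level $A_{k-1}$ is $n\ell^{-(k-1)/k}$.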

Next, for every vertex $v$, define the bunch of $v$ exactly as in the Thorup-Zwick definition, but with a small
change: omit the last level, namely

$$B(v) = \bigcup_{i=0}^{k-2} \{ u \in A_i \setminus A_{i+1} \mid \mathrm{dist}(v, u) < \mathrm{dist}(v, A_{i+1}) \}.$$

The pivot $p_i(v)$ is also exactly as in Thorup-Zwick's definition, namely $p_i(v)$ is the closest node to $v$ in $A_i$
(break ties arbitrarily).

Next, for every node $v \in A_{k-1}$, store its distance to every label $\lambda \in L$, namely $\mathrm{dist}(v, \lambda)$, in a hash table.
Finally, for every label $\lambda \in L$, store $B(\lambda) = \bigcup_{v \in V_\lambda} B(v)$ in a hash table, where $V_\lambda$ denotes the set of
$\lambda$-labeled nodes, and for every node $x \in B(\lambda)$ store $\mathrm{dist}(x, \lambda)$.

This completes the description of our data structure. 
Below, we bound the size of the data structure. 

Lemma 1. $E[|B(v)|] = (k - 1)\ell^{1/k}$.

Proof: Using the same analysis as Thorup-Zwick's, one can show that the size of $B(v) \cap (A_i \setminus A_{i+1})$,
for $0 \le i \le k - 2$, is stochastically dominated by a geometric random variable with parameter $p = \ell^{-1/k}$.
Hence, $E[|B(v) \cap (A_i \setminus A_{i+1})|] = \ell^{1/k}$. We thus get that $E[|B(v)|] = (k - 1)\ell^{1/k}$. □

Lemma 2. The expected size of our data structure is $O(kn\ell^{1/k})$.

Proof: By Lemma 1, the total expected size of the bunches of all nodes is $(k - 1)n\ell^{1/k}$. In addition, for
every node in $A_{k-1}$ we also store its distance to every label $\lambda \in L$. That is, for every node in $A_{k-1}$,
we store additional data of size $\ell$. The expected size of $A_{k-1}$ is $n\ell^{-(k-1)/k}$. To see this, note that the
probability that a node $v$ belongs to $A_i$ is $\ell^{-i/k}$. Therefore, the total additional expected size stored for all
nodes in $A_{k-1}$ is $n\ell^{-(k-1)/k} \cdot \ell = n\ell^{1/k}$. Finally, storing $B(\lambda)$ for every $\lambda \in L$ does not change the asymptotic size since

$$\sum_{\lambda \in L} |B(\lambda)| \le \sum_{\lambda \in L} \sum_{v \in V_\lambda} |B(v)| = \sum_{v \in V} |B(v)|.$$

The lemma follows. □

2.2 Vertex-Label Queries 

We now describe our query answering algorithm, with the input vertex-label query $(v \in V, \lambda \in L)$.

The query answering algorithm is done as follows. For every index $i$ from 0 to $k - 2$, check if $p_i(v) \in B(\lambda)$,
and if so return $\mathrm{dist}(v, p_i(v)) + \mathrm{dist}(p_i(v), \lambda)$. Otherwise, if no such index exists, return $\mathrm{dist}(v, p_{k-1}(v)) +
\mathrm{dist}(p_{k-1}(v), \lambda)$. This completes the query answering algorithm.
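The query procedure above can be sketched as follows, assuming the pivots and stored distances are available as plain dictionaries. All parameter names (`pivot`, `pivot_dist`, `label_bunch`, `top_dist`) are hypothetical containers chosen for this example, not structures named in the paper.

```python
def query(v, lam, k, pivot, pivot_dist, label_bunch, top_dist):
    """Approximate dist(v, lam) using the first pivot found in B(lam).

    pivot[v][i]       = p_i(v) for i = 0..k-1
    pivot_dist[v][i]  = dist(v, p_i(v))
    label_bunch[lam]  = {x: dist(x, lam)} for every x in B(lam)
    top_dist[(x,lam)] = dist(x, lam) for every x in A_{k-1}
    """
    stored = label_bunch.get(lam, {})
    for i in range(k - 1):  # i = 0, ..., k-2
        p = pivot[v][i]
        if p in stored:
            # Stop at the first index whose pivot lies in B(lam).
            return pivot_dist[v][i] + stored[p]
    # No pivot below the top level is in B(lam): fall back to A_{k-1}.
    p = pivot[v][k - 1]
    return pivot_dist[v][k - 1] + top_dist[(p, lam)]
```

The loop performs at most $k - 1$ hash-table lookups, so the sketch runs in $O(k)$ time under the usual expected-constant-time hashing assumption.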

We now turn to the stretch analysis. If there exists an index $i$ such that $0 \le i \le k - 2$ and $p_i(v) \in B(\lambda)$,
set $i$ to be the first such index. If no such index exists, set $i = k - 1$. Let $u$ be the $\lambda$-labeled node closest
to $v$, namely $\mathrm{dist}(v, u) = \mathrm{dist}(v, \lambda)$. Note that $p_j(v) \notin B(u)$ for every $j < i$. This is due to the facts that
$p_j(v) \notin B(\lambda)$ and that $B(u) \subseteq B(\lambda)$. Using the same analysis as in [16] (Lemma A.1), one can show that
$\mathrm{dist}(v, p_i(v)) \le 2i \cdot \mathrm{dist}(v, u)$ and $\mathrm{dist}(p_i(v), \lambda) \le \mathrm{dist}(p_i(v), u) \le (2i + 1)\mathrm{dist}(v, u)$. Since $i \le k - 1$, we get that
the returned distance $\widetilde{\mathrm{dist}}(v, \lambda)$ satisfies $\widetilde{\mathrm{dist}}(v, \lambda) = \mathrm{dist}(v, p_i(v)) + \mathrm{dist}(p_i(v), \lambda) \le (4k - 3)\mathrm{dist}(v, u) =
(4k - 3)\mathrm{dist}(v, \lambda)$.
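For reference, the $(4k - 3)$ bound can be derived by the standard Thorup-Zwick style induction, sketched below with pivots indexed from 0 and $d = \mathrm{dist}(v, u) = \mathrm{dist}(v, \lambda)$. This is a reconstruction of the usual argument under the assumption $p_j(v) \notin B(u)$ for all $j < i$, not a quotation of Lemma A.1.

```latex
\begin{align*}
&\text{Claim: } \mathrm{dist}(v, p_j(v)) \le 2j \cdot d \text{ for all } 0 \le j \le i.\\
&\text{Base: } p_0(v) = v, \text{ so } \mathrm{dist}(v, p_0(v)) = 0.\\
&\text{Step } (j < i):\ p_j(v) \notin B(u) \text{ implies }
  \mathrm{dist}(u, A_{j+1}) \le \mathrm{dist}(u, p_j(v)) \le d + 2j \cdot d = (2j+1)d,\\
&\qquad \text{hence } \mathrm{dist}(v, p_{j+1}(v)) \le d + \mathrm{dist}(u, A_{j+1}) \le (2j+2)d.\\
&\text{Therefore } \mathrm{dist}(p_i(v), u) \le \mathrm{dist}(p_i(v), v) + d \le (2i+1)d, \text{ and}\\
&\qquad \mathrm{dist}(v, p_i(v)) + \mathrm{dist}(p_i(v), \lambda) \le 2i \cdot d + (2i+1)d = (4i+1)d \le (4(k-1)+1)d = (4k-3)d.
\end{align*}
```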

Note that if $i \le k - 2$, then the distance $\mathrm{dist}(p_i(v), \lambda)$ is stored in $B(\lambda)$; if $i = k - 1$, then $p_i(v) \in A_{k-1}$,
and recall that $\mathrm{dist}(u, \lambda)$ is stored for every $u \in A_{k-1}$. Therefore $\mathrm{dist}(p_i(v), \lambda)$ is also stored and can be
retrieved in $O(1)$ time.

Finally, using the same method as in [16] (Lemma A.2), the stretch can be reduced to $4k - 5$ as required.

We note that Hermelin et al. [8] have to check all indices. Namely, their query algorithm is to return
the minimal distance $\mathrm{dist}(v, w) + \mathrm{dist}(w, w_\lambda)$ over all $w = p_i(v)$ such that $w \in B(\lambda)$, where we define $w_\lambda$
to be the $\lambda$-labeled node closest to $w$ that satisfies $w \in B(w_\lambda)$. Let $u$ be the $\lambda$-labeled node that satisfies
$\mathrm{dist}(v, u) = \mathrm{dist}(v, \lambda)$. Hermelin et al. [8] note that the first $w = p_i(v) \in B(\lambda)$ does not necessarily satisfy
$\mathrm{dist}(w, w_\lambda) \le \mathrm{dist}(w, u)$, since there is a possibility that $w \notin B(u)$. Therefore, they have to iterate over all
indices $1 \le i \le k - 1$ and take the one that gives the minimal distance. We bypass this issue by simply
explicitly storing the distance $\mathrm{dist}(w, \lambda)$, rather than $\mathrm{dist}(w, w_\lambda)$, for every $w \in B(\lambda)$. This does not
increase the asymptotic size and it simplifies the query algorithm and its analysis.

2.3 Construction Time 

The preprocessing time of our construction is composed of the time it takes to construct the different
components of our data structure. Recall that our data structure is composed of four components. The first
component is the pivots: for every node $v$ we store $p_i(v)$ for $1 \le i \le k - 1$. The second component is the
bunches of the vertices: for every node $v$ we store $B(v)$ and the distances $\mathrm{dist}(v, x)$ for every $x \in B(v)$. The
third component is the bunches of the labels: for every label $\lambda$ we store $B(\lambda)$ and the distances $\mathrm{dist}(x, \lambda)$
for every $x \in B(\lambda)$. The fourth component is the distances of the nodes in $A_{k-1}$ to all labels: store $\mathrm{dist}(v, \lambda)$ for
every $v \in A_{k-1}$ and $\lambda \in L$.

Using the same analysis as in [17], one can show that the time complexity for constructing the first
component is $O(k \cdot m)$ and the time complexity for the second component is $O(km\ell^{1/k})$.

Constructing $B(\lambda)$ for every $\lambda \in L$ (the first part of the third component) can be done easily in $O(kn\ell^{1/k})$
time [just go over all nodes $v$, and add $B(v)$ to $B(\lambda(v))$].

We are left with computing $\mathrm{dist}(x, \lambda)$ for every $x \in B(\lambda)$ and then for every $x \in A_{k-1}$ and $\lambda \in L$. This
can be done by invoking Dijkstra's Algorithm $\ell$ times (for every label $\lambda \in L$ add a source node $s$ and connect
all $\lambda$-labeled nodes to $s$ with an edge of weight 0, and then invoke Dijkstra's Algorithm from $s$), and thus the
running time for this part is $O(m\ell)$.

We get that the total running time for the preprocessing phase is $O(m\ell)$.
We note here that if $\ell > n^{k/(2k-1)}$, then it is possible to reduce the preprocessing running time to
$O(mn^{k/(2k-1)})$. This can be done by storing $\mathrm{dist}(v, v_\lambda)$ as suggested in [8] instead of storing $\mathrm{dist}(v, \lambda)$ for
every $v \in B(\lambda)$. This change forces checking all indices in the query algorithm as explained above. The
analysis of the preprocessing time in this case is similar to the one presented in [8].

3 Dynamic Labels 

In this section, we consider the problem of constructing a dynamic vertex-label distance oracle and prove
Theorem 2. Namely, we show how to construct a vertex-label distance oracle that supports label changes of
the form $update(v, \lambda)$ for $v \in V$ and $\lambda \in L$. This update changes the label of $v$ to be $\lambda$ and leaves all other
nodes unchanged. Our data structure in this section is a slight adaptation of Thorup-Zwick's construction
and is also similar to the one presented in [8] for static vertex-label distance oracles.

In Subsection 3.1 we present the data structure, Subsection 3.2 presents the query answering algorithm,
and in Subsection 3.3 we analyze the construction time.

3.1 The Data Structure 

For a given positive integer $k$, construct the sets $V = A_0 \supseteq A_1 \supseteq \cdots \supseteq A_{k-1} \supseteq A_k = \emptyset$ as follows. The $i$-th
level $A_i$ is constructed by sampling the vertices of $A_{i-1}$ independently at random with probability $p$, to be
specified shortly, for $1 \le i \le k - 1$.

The bunch of $v$ is defined as in Thorup-Zwick as follows:

$$B(v) = \bigcup_{i=0}^{k-1} \{ u \in A_i \setminus A_{i+1} \mid \mathrm{dist}(v, u) < \mathrm{dist}(v, A_{i+1}) \}.$$

The pivot $p_i(v)$ is also defined exactly as in Thorup-Zwick's definition, namely $p_i(v)$ is the closest node to $v$ in
$A_i$ (break ties arbitrarily).

In order to allow fast updates, the size of every bunch $B(v)$ for $v \in V$ must be small. In order to ensure
this property, we set the sampling probability to be $p = (n/\ln n)^{-1/k}$. It was proven in [17] that by setting
$p = (n/\ln n)^{-1/k}$, the size of every bunch $B(v)$ is $O(n^{1/k} \log^{1-1/k} n)$ with high probability.

In addition, for every $\lambda \in L$, store $B(\lambda) = \bigcup_{v \in V_\lambda} B(v)$ in a hash table. Recall that in the static setting
we store $\mathrm{dist}(v, \lambda)$ when $v \in B(\lambda)$. In the dynamic setting, we do not store this data as it is too costly to
update it, for two reasons. First, notice that a single label change, say from $\lambda_1$ to $\lambda_2$, might require updating
$\mathrm{dist}(v, \lambda_1)$ and $\mathrm{dist}(u, \lambda_2)$ for many nodes $v \in B(\lambda_1)$ and $u \in B(\lambda_2)$. As both $B(\lambda_1)$ and $B(\lambda_2)$ might be very
large, this may take a long time. Second, even a single update of $\mathrm{dist}(v, \lambda)$ might be too costly as it might
require invoking a shortest path algorithm during the update phase.

To avoid the need of updating $\mathrm{dist}(v, \lambda)$ for a node $v \in B(\lambda)$, we do the following two things. First,
rather than maintaining the value $\mathrm{dist}(v, \lambda)$, we instead maintain the value $\mathrm{dist}(v, v_\lambda)$, where $v_\lambda$ is defined
to be the closest $\lambda$-labeled node such that $v \in B(v_\lambda)$. Second, we use the method of [8] and iterate over all
indices $1 \le i \le k - 1$ and return the minimal distance $\mathrm{dist}(v, w) + \mathrm{dist}(w, w_\lambda)$ for $w = p_i(v)$ in the query
answering algorithm.

In order to maintain the value $\mathrm{dist}(v, v_\lambda)$ for a node $v \in B(\lambda)$, we store the set of $\lambda$-labeled nodes $x$ such
that $v$ belongs to $B(x)$ in a heap, $Heap(v, \lambda)$. Namely, the set of nodes in $Heap(v, \lambda)$ is $V(Heap(v, \lambda)) =
\{x \in V \mid v \in B(x) \text{ and } \lambda(x) = \lambda\}$, where the key, $key(x)$, of a node $x \in V(Heap(v, \lambda))$ is the distance
$\mathrm{dist}(v, x)$. The heap $Heap(v, \lambda)$ supports the standard operations of [$insert(x)$ - insert a node $x$ to the
heap], [$remove(x)$ - remove a node $x$ from the heap] and [$minimum()$ - return the node $x$ in the heap with
minimal $key(x)$]. For this purpose, we use any standard construction of heaps (e.g. [18]) that allows insert
and remove operations in $O(\log\log n)$ time and $minimum()$ operations in constant time.

We now summarize the different components of our data structure to make it clear what parts of the
data structure need to be updated as a result of a label change.
(1) For every node $v$, store $B(v)$ and for every node $x \in B(v)$, store $\mathrm{dist}(v, x)$. This data is stored in a
hash table, which allows checking if a node $x \in B(v)$ and, if so, finding $\mathrm{dist}(v, x)$ in $O(1)$ time.

(2) For every node $v$ and index $1 \le i \le k - 1$, store $p_i(v)$.

(3) For every $\lambda \in L$, store $B(\lambda)$ in a hash table where the entry in the hash table of a node $v \in B(\lambda)$ points
to the heap $Heap(v, \lambda)$.

It is not hard to see that only component (3) in our data structure needs to be modified as a result of
a label change. Moreover, if the label of some node $v \in V$ is changed from $\lambda_1 \in L$ to $\lambda_2 \in L$, then only
$B(\lambda_1)$ and $B(\lambda_2)$ need to be updated. The update is relatively simple. For every node $x \in B(v)$, do the
following: Remove $v$ from $Heap(x, \lambda_1)$. If $Heap(x, \lambda_1)$ becomes empty, then also remove $x$ from the hash
table of $B(\lambda_1)$. In addition, if $x \in B(\lambda_2)$, then add $v$ to $Heap(x, \lambda_2)$; otherwise add $x$ to the hash table
$B(\lambda_2)$ and create a new heap, $Heap(x, \lambda_2)$, containing only $v$. Each such operation takes $O(\log\log n)$ time
for every $x \in B(v)$. Recall that the size of $B(v)$ is $O(n^{1/k} \log^{1-1/k} n)$; thus we get that the update requires
$O(n^{1/k} \log^{1-1/k} n \log\log n)$ time.
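The label-change procedure can be sketched as below. As a simplifying assumption, each $Heap(x, \lambda)$ is modeled by a plain dictionary from member node to key, so `closest_in_heap` takes linear time in the heap size; a real implementation would substitute a heap with $O(\log\log n)$ updates as the text requires. All names (`update_label`, `closest_in_heap`, `bunch`, `heaps`) are hypothetical.

```python
def update_label(v, new_lam, label, bunch, heaps):
    """Change the label of v, touching only the heaps of nodes in B(v).

    label:  {node: current label}
    bunch:  {node: {x: dist(node, x)}}, i.e. B(node) with distances
    heaps:  {lam: {x: {u: dist(x, u)}}}; heaps[lam][x] stands in for Heap(x, lam)
    """
    old_lam = label[v]
    if old_lam == new_lam:
        return
    for x, d in bunch[v].items():
        old = heaps[old_lam][x]
        del old[v]                      # remove v from Heap(x, old_lam)
        if not old:
            del heaps[old_lam][x]       # x leaves B(old_lam)
        # Insert v into Heap(x, new_lam), creating it if x was not in B(new_lam).
        heaps.setdefault(new_lam, {}).setdefault(x, {})[v] = d
    label[v] = new_lam


def closest_in_heap(x, lam, heaps):
    """Stand-in for Heap(x, lam).minimum(): returns (node, key) or None."""
    members = heaps.get(lam, {}).get(x)
    if not members:
        return None
    u = min(members, key=members.get)
    return u, members[u]
```

Only the entries for nodes $x \in B(v)$ are touched, which is why the update cost is proportional to $|B(v)|$ times the cost of one heap operation.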

It is not hard to verify that the size of the data structure is $O(\sum_{v \in V} |B(v)|) = O(n^{1+1/k} \log^{1-1/k} n)$.

3.2 Vertex-Label Queries 

The query answering algorithm is similar to the one presented in Section 2. Let $(v \in V, \lambda \in L)$ be the input
vertex-label query.

The query answering algorithm is done by checking all indices $1 \le i \le k - 1$ and returning the minimal
$\mathrm{dist}(v, p_i(v)) + \mathrm{dist}(p_i(v), w_\lambda)$ such that $p_i(v) \in B(\lambda)$, where $w_\lambda$ is the node returned by $Heap(p_i(v), \lambda).minimum()$;
namely, $w_\lambda$ is the $\lambda$-labeled node such that $p_i(v) \in B(w_\lambda)$ with minimal $\mathrm{dist}(p_i(v), w_\lambda)$.

Note that here we must check all indices and cannot stop upon reaching the first index $j$ such that
$p_j(v) \in B(\lambda)$. Let $u$ be the $\lambda$-labeled node closest to $v$, namely $\mathrm{dist}(v, u) = \mathrm{dist}(v, \lambda)$. As mentioned
by Hermelin et al. [8] (and discussed above), the first $w = p_i(v) \in B(\lambda)$ does not necessarily satisfy
$\mathrm{dist}(w, w_\lambda) \le \mathrm{dist}(w, u)$, since it may be that $w \notin B(u)$. Therefore, we also have to iterate over all indices
$1 \le i \le k - 1$ and take the one that gives the minimal distance.

It is not hard to verify that the query algorithm takes $O(k)$ time.

Finally, as mentioned in Section 2, using the same analysis as in [16] (Lemma A.1), the stretch is $(4k - 3)$,
and using the same method as in [16] (Lemma A.2), the stretch can be reduced to $4k - 5$ as required.

3.3 Construction Time 

The first two components of the data structure are exactly the same construction as Thorup-Zwick's
and thus can be constructed in $O(kmn^{1/k})$ time. The third component can be constructed in
$O(\sum_{v \in V} |B(v)| \cdot \log\log n) = O(kn^{1+1/k} \log\log n)$ time (the $\log\log n$ comes from the insertion to the heaps). We
conclude that the total preprocessing time is $O(kmn^{1/k})$.

4 Sparse Vertex-Label Spanners 

In this section, we shall address the question of finding low stretch sparse vertex-label spanners. More
specifically, we show how to find a subgraph $H$ with expected number of edges $\tilde{O}(n\ell^{1/k})$ such that for every
vertex $v$ and label $\lambda$, $\mathrm{dist}(v, \lambda, H) \le (4k + 1)(1 + \epsilon)\mathrm{dist}(v, \lambda, G)$ for any fixed $\epsilon > 0$. Note that it is unclear
how to transform the construction of Section 2 into a vertex-label spanner. To see this, recall that for every
node $v$ in $A_{k-1}$ and for every label $\lambda$ we store the distance $\mathrm{dist}(v, \lambda)$. However, in order to allow a low-stretch
spanner, we need to add a shortest path $P$ from $v \in A_{k-1}$ to its closest $\lambda$-labeled node. This path could be
very long and, of course, may contain many nodes not in $A_{k-1}$. Thus, adding all these paths may result in
a subgraph with too many edges. Therefore, transforming the construction of Section 2 into a vertex-label
spanner seems challenging. We hence suggest a different construction for spanners.

For simplicity, we first present (Subsection 4.1) a construction for unweighted graphs and then (Subsection
4.2) we show how to generalize this construction to weighted graphs.

4.1 Unweighted Graphs

For a node $v$, radius $r$, and subgraph $H$, let $B(v, r, H) = \{x \in V(H) \mid \mathrm{dist}(v, x, H) \le r\}$.
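In an unweighted graph the ball $B(v, r, H)$ can be computed by a breadth-first search truncated at depth $r$. The sketch below assumes the subgraph is given implicitly by a set `alive` of surviving nodes (a hypothetical representation chosen for this example), and that $v$ itself is in that set.

```python
from collections import deque


def ball(adj, v, r, alive=None):
    """Return {x: hop distance from v} for all x within r hops of v.

    adj:   {u: iterable of neighbours} for an unweighted graph
    alive: optional set of nodes inducing the subgraph; defaults to all nodes
    """
    if alive is None:
        alive = set(adj)
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond radius r
        for w in adj[u]:
            if w in alive and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist
```

Passing a shrinking `alive` set is one way to model the graph $G'$ from which nodes are removed during the algorithm.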

We start by describing an algorithm named VL_Cover that, when given a distance $d$, returns a
spanner $H$ with the following property. For every node $v \in V$ and label $\lambda \in L$ such that $\mathrm{dist}(v, \lambda, G) \le d$,
$\mathrm{dist}(v, \lambda, H) \le (4k + 1)d$.

Loosely speaking, the algorithm proceeds as follows. It consists of two stages. The first stage handles
"sparse" areas, namely, balls around some node $v$ such that $|B(v, kd, G)| < d \cdot \ell^{(k-1)/k}$. In this case, we show
that we can tolerate adding a BFS tree $T(v)$ spanning $B(v, id, G)$ for some $1 \le i \le k$, charging the nodes
in $B(v, (i - 1)d, G)$. It is not hard to verify that every path $P$ of length at most $d$ that contains a node in
$B(v, (i - 1)d, G)$ satisfies $V(P) \subseteq B(v, id, G)$. Using the tree $T(v)$, we have a "short" alternative path to $P$.
The second stage handles "dense" areas, namely, balls around some node $v$ such that $|B(v, kd, G)| \ge d \cdot \ell^{(k-1)/k}$.
The algorithm picks a set $C \subseteq V$ such that the distance between every two nodes in $C$ is at least $2kd$, and
such that every node in a "dense" area has a "close" node in $C$. In this case, we can tolerate adding $O(d \cdot \ell)$ edges
for every node $c \in C$, charging the nodes in $B(c, kd, G)$. The algorithm connects every node $u \in V$ to some
"close" node $c \in C$ by a shortest path. In addition, for every label $\lambda$ such that the distance from $c$ to $\lambda$ is
$O(d)$, we add a "short" path from $c$ to $\lambda$. In this case, for every node $u$ and label $\lambda$ such that $\mathrm{dist}(u, \lambda) \le d$,
we have a "short" alternative path by concatenating the path from $u$ to its "close" node $c \in C$ and the path
from $c$ to $\lambda$.

We now describe the algorithm more formally. 

Initially, set $G' = G$, $H_d = (V, \emptyset)$, and $C = \emptyset$. The algorithm consists of two stages. The first stage of the
algorithm is done as follows. As long as there exists a node $v \in V(G')$ such that $|B(v, kd, G')| < d \cdot \ell^{(k-1)/k}$,
pick $v$ to be such a node. Let $i$ be the minimal index such that $|B(v, id, G')| < d \cdot \ell^{(i-1)/k}$. Construct a
shortest-path tree $T(v)$ rooted at $v$ and spanning $B(v, id, G')$ and add the edges of $T(v)$ to $H_d$. If $i > 1$, then
remove the nodes $B(v, (i - 1)d, G')$ from $G'$; if $i = 1$, remove the nodes $B(v, d, G')$.
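The choice of the index $i$ in the first stage can be sketched as a small helper. It assumes the threshold for index $i$ is $d \cdot \ell^{(i-1)/k}$, which is how the threshold is read here from the surrounding analysis (for $i = 1$ the ball must contain fewer than $d$ nodes, and for $i = k$ the threshold coincides with the sparsity test). The function name `minimal_sparse_index` and its input format are hypothetical.

```python
def minimal_sparse_index(ball_sizes, d, num_labels, k):
    """Return the minimal i in 1..k with |B(v, i*d, G')| < d * l^((i-1)/k).

    ball_sizes[i - 1] is assumed to hold |B(v, i*d, G')| for i = 1..k.
    Returns None if no index qualifies (v is then not in a sparse area).
    """
    for i in range(1, k + 1):
        threshold = d * num_labels ** ((i - 1) / k)
        if ball_sizes[i - 1] < threshold:
            return i
    return None
```

For a node picked in the first stage the helper never returns `None`, because the sparsity test is exactly the condition for $i = k$.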

The second stage of the algorithm is done as follows. As long as there is a node $v \in V(G')$ such that
$B(v, 2k \cdot d, G') \cap C = \emptyset$, pick $v$ to be such a node, and add it to $C$. For every node $c \in C$, do the following.
First, let $B(c)$ be all nodes $u$ in $G'$ such that $c$ is closer to $u$ than any other $c' \in C$. (We assume unique
shortest paths. This is without loss of generality, as one can artificially create differences between the paths
by slightly perturbing the input to ensure uniqueness.) Second, construct a BFS tree rooted at $c$ and spanning
the nodes in $B(c)$ and add the edges of the BFS tree to $H_d$. Third, for every label $\lambda \in L$, if there exists a node
$v \in B(c)$ such that $\mathrm{dist}(v, \lambda, G) \le d$, then pick $v$ to be such a node and add a shortest path $P(v, \lambda)$ from $v$
to its closest $\lambda$-labeled node, and add the edges of the path $P(v, \lambda)$ to $H_d$.

This completes the construction of the spanner. See Figure 1 for the formal code.

We now turn to analyze the stretch and the number of edges in the resulting spanner $H_d$.

Consider a node $v$ that is picked in the "while" loop of the first stage of the algorithm. Let $G'(v)$ be the
graph $G'$ in the algorithm just before $v$ (and the ball around it) was removed from $G'$.

Lemma 3. The number of edges in $H_d$ is $O(n\ell^{1/k})$.

Proof: The algorithm adds edges in three different locations.

The first location is in the first stage of the algorithm. The second location is edges in the shortest-path
trees spanning $B(c)$ for every $c \in C$. The third location is edges on paths $P(y, \lambda)$ for some node $c \in C$, $\lambda \in L$,
and $y \in B(c)$. We now show that the number of edges added in each of the three locations is $O(n\ell^{1/k})$.

Consider the first location. Let $v \in V$ be a node that is picked in the first stage of Algorithm VL_Cover
and let $1 \le i \le k$ be the minimal index such that $|B(v, id, G'(v))| < d \cdot \ell^{(i-1)/k}$. A BFS tree $T(v)$ rooted
at $v$ and spanning $B(v, id, G'(v))$ is added to $H_d$. Namely, fewer than $d \cdot \ell^{(i-1)/k}$ edges are added
to $H_d$. We consider two cases: first, when $i > 1$, and second, when $i = 1$. Consider the first case. Note
that by the minimality of the index $i$, $|B(v, (i - 1)d, G'(v))| \ge d \cdot \ell^{(i-2)/k}$. We can thus charge the nodes
in $B(v, (i - 1) \cdot d, G')$ with the edges in $T(v)$ added to $H_d$. Note that every node in $B(v, (i - 1) \cdot d, G')$ is
charged with at most $\ell^{1/k}$ edges. Moreover, the nodes in $B(v, (i - 1) \cdot d, G')$ are removed from $G'$ and thus
no node is charged twice. Consider the second case. Recall that in this case, a BFS tree $T(v)$ is added to
$H_d$. We can charge the nodes in $B(v, d, G')$ with the edges in $T(v)$ added to $H_d$, charging each node with a
single edge. Moreover, the nodes in $B(v, d, G')$ are removed from $G'$ and thus no node is charged twice. We
thus conclude that the number of edges added in the first location is $O(n\ell^{1/k})$.

Procedure VL_Cover(G, d)

$G' \leftarrow G$, $H_d \leftarrow (V, \emptyset)$, $C \leftarrow \emptyset$
**** Stage 1 ****
while $\exists$ node $v \in V(G')$ such that $|B(v, kd, G')| < d \cdot \ell^{(k-1)/k}$ do:
    let $i$ be the minimal index such that $|B(v, id, G')| < d \cdot \ell^{(i-1)/k}$
    construct a shortest-path tree $T(v)$ rooted at $v$ and spanning $B(v, id, G')$
    add the edges of $T(v)$ to $H_d$
    if $i > 1$ then
        remove the nodes $B(v, (i - 1)d, G')$ from $G'$
    else ($i = 1$)
        remove the nodes $B(v, d, G')$ from $G'$
**** Stage 2 ****
while $\exists v \in V(G')$ such that $B(v, 2k \cdot d, G') \cap C = \emptyset$ do:
    $C \leftarrow C \cup \{v\}$
for every node $c \in C$ do:
    let $B(c) = \{u \in V(G') \mid \mathrm{dist}(u, c) = \mathrm{dist}(u, C)\}$
    construct a shortest-path tree $T(c)$ rooted at $c$ and spanning $B(c)$
    add the edges of $T(c)$ to $H_d$
    for every label $\lambda \in L$ such that $\exists y \in B(c)$ with $\mathrm{dist}(y, \lambda) \le d$ do:
        pick $y$ to be such a node
        add $E(P(y, \lambda, G))$ to $H_d$
return $H_d$

Fig. 1. Constructing vertex-labeled spanners for unweighted graphs

Consider the second location. Note that every node belongs to exactly one set $B(c)$ for some $c \in C$. In
addition, note that due to unique shortest paths, for every node $u \in B(c)$, $V(P(u, c)) \subseteq B(c)$. We thus add a
single edge for every node $v \in V$ for this stage. We obtain that $O(n)$ edges are added for the second location.

Finally, consider the third location. Let $c \in C$. Note that $|B(c, k \cdot d, G')| \ge d \cdot \ell^{(k-1)/k}$. The number of
edges added for a path $P(y, \lambda)$ for some $\lambda \in L$ and $y \in B(c)$ is at most $d$, since $\mathrm{dist}(y, \lambda) \le d$. There are at
most $\ell$ labels; therefore, at most $\ell \cdot d$ edges are added for the node $c$. We charge the nodes in $B(c, k \cdot d, G')$
with these edges, charging each node in $B(c, k \cdot d, G')$ with at most $O(\ell^{1/k})$ edges. Note that since the nodes
in $C$ are at distance at least $2kd + 1$ from one another, no node is charged twice. We thus conclude that the
number of edges added for the third location is $O(n\ell^{1/k})$.

The lemma follows. □

Lemma 4. For every node $u \in V$ and label $\lambda \in L$ such that $\mathrm{dist}(u, \lambda) \le d$, $\mathrm{dist}(u, \lambda, H_d) \le (4k + 1)d$.

Proof: Consider a node $u \in V$ and a label $\lambda \in L$ such that $\mathrm{dist}(u, \lambda) \le d$.

Let $P(u, \lambda)$ be the shortest path from $u$ to its closest $\lambda$-labeled node $u_\lambda$. We consider two cases. The first
case is when some node $y \in V(P(u, \lambda))$ is deleted from the graph $G'$. The second case is when none of the
nodes on $P(u, \lambda)$ is deleted from the graph $G'$.

Consider the first case. Let y be the first node on P{u, A) that is deleted from G" by the algorithm. Let 
V be the node picked by the "while" loop of the first stage of the algorithm, such that y is removed in v's 
iteration. Let i be the minimal index such that \B{v, id, G')\ < x ■ . 

We consider two subcases: first, when $i = 1$, and second, when $i > 1$. Consider the first subcase. Write $P = P(u, \lambda)$.
Note that $V(P) \subseteq V(G'(v))$, since $y$ is the first node in $V(P)$ that is removed from $G'$. We claim that
$V(P) \subseteq B(v, d, G'(v))$, and show it as follows. Assume, for the sake of argument, that $V(P) \not\subseteq B(v, d, G'(v))$.
This implies that there must exist a node $w$ such that $\mathrm{dist}(v, w, G'(v)) = d$. But note that the shortest
path $P' = P(v, w, G'(v))$ from $v$ to $w$ in $G'(v)$ contains at least $d$ nodes and that $V(P') \subseteq B(v, d, G'(v))$. We thus get
that $|B(v, d, G'(v))| \ge d$, a contradiction. Recall that a BFS tree $T(v)$ rooted at $v$ and spanning $B(v, d, G')$ is
added to $H_d$. We get that $\mathrm{dist}(u, \lambda, H_d) \le \mathrm{dist}(u, \lambda, T(v)) \le \mathrm{dist}(u, v, T(v)) + \mathrm{dist}(v, u_\lambda, T(v)) \le 2d$.

Consider the second subcase. Note that before $B(v, (i-1) \cdot d, G')$ is removed from $G'$, $V(P(u, \lambda)) \subseteq
V(G'(v))$, since we assume that $y$ is the first node on $P(u, \lambda)$ that is removed from $G'$. Recall that a BFS
tree $T(v)$ rooted at $v$ and spanning $B(v, i \cdot d, G')$ is added to $H_d$. Note also that $V(P(u, \lambda)) \subseteq B(v, i \cdot d, G')$
since $y \in B(v, (i-1) \cdot d, G')$ and $\mathrm{dist}(y, z) \le d$ for every $z \in V(P(u, \lambda))$. Specifically, $u, u_\lambda \in B(v, id, G'(v))$.
We get that $\mathrm{dist}(u, \lambda, H_d) \le \mathrm{dist}(u, \lambda, T(v)) \le \mathrm{dist}(u, v, T(v)) + \mathrm{dist}(v, u_\lambda, T(v)) \le 2i \cdot d \le 2k \cdot d$.

Consider the second case. We again consider two subcases. The first subcase is when $u \in C$. The second
subcase is when $u \notin C$.

Consider the first subcase. Note that $\mathrm{dist}(u, \lambda) \le d$ and clearly $u \in B(u)$. Therefore, a path $P(y, \lambda)$ is
added to $H_d$ for some $y \in B(u)$. We get that $\mathrm{dist}(u, \lambda, H_d) \le \mathrm{dist}(u, y, H_d) + \mathrm{dist}(y, \lambda, H_d) \le 2kd + d =
(2k + 1)d$.

Consider the second subcase. The node $u$ is not removed from $G'$, and furthermore $u \notin C$. Note that
this could only happen when $u$ is at distance at most $2kd$ from some node in $C$. Let $c \in C$ be the closest
node to $u$. Note that $u \in B(c)$ and thus a shortest path from it to $c$ is added to $H_d$ (as part of $T(c)$).
Moreover, $\mathrm{dist}(u, \lambda) \le d$. Hence, a shortest path $P(y, \lambda)$ is added to $H_d$ for some $y \in B(c)$. We get that
$\mathrm{dist}(u, \lambda, H_d) \le \mathrm{dist}(u, c, H_d) + \mathrm{dist}(c, y, H_d) + \mathrm{dist}(y, \lambda, H_d) \le 2kd + 2kd + d = (4k + 1)d$. ∎

The main algorithm for constructing our spanner operates in $\log n$ iterations. For a given fixed parameter
$\epsilon$, for every index $1 \le i \le \log n$, invoke Algorithm VL_Cover with parameter $d(i) = (1 + \epsilon)^i$. Let $H_{d(i)}$
be the subgraph returned by Algorithm VL_Cover. Let $H$ be the union of all subgraphs $H_{d(i)}$ for
$1 \le i \le \log n$. This completes our spanner construction.

It is not hard to verify that by Lemmas 3 and 4 we have the following.

Theorem 4. For every unweighted graph $G$ and fixed parameter $\epsilon$, one can efficiently construct a vertex-label
$(4k + 1)(1 + \epsilon)$-spanner with $O(\log n \cdot n\ell^{1/k})$ edges.
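
The stretch guarantee of a vertex-label spanner can be checked empirically on small unweighted instances. The following is a minimal sketch, not part of the construction: the helper names `label_distances` and `max_stretch`, and the adjacency-dictionary representation, are assumptions made for illustration. It runs a multi-source BFS from all nodes carrying a label and reports the worst ratio between distances in a candidate subgraph $H$ and in $G$.

```python
from collections import deque

def label_distances(adj, labeled_nodes):
    """Multi-source BFS: hop distance from every reachable node to the
    closest node in `labeled_nodes` (unweighted graph)."""
    dist = {s: 0 for s in labeled_nodes}
    queue = deque(labeled_nodes)
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def max_stretch(adj_g, adj_h, labels):
    """`labels` maps each label to the set of nodes carrying it.
    Returns the maximum of dist(u, label, H) / dist(u, label, G)."""
    worst = 1.0
    for nodes in labels.values():
        dg = label_distances(adj_g, nodes)
        dh = label_distances(adj_h, nodes)
        for u, d in dg.items():
            if d > 0:
                # A node unreachable in H counts as infinite stretch.
                worst = max(worst, dh.get(u, float("inf")) / d)
    return worst
```

For a subgraph produced by the construction with parameters $k$ and $\epsilon$, one would expect `max_stretch` to return a value of at most $(4k + 1)(1 + \epsilon)$.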

4.2 Weighted Graphs 

In this section we generalize our spanner construction for weighted graphs. 

Note that in the unweighted case we exploit the fact that a path $P$ of length $d$ contains $d$ edges. In the
weighted case, this no longer holds. A path of length $d$ could potentially contain a much smaller or larger
number of edges. We thus need to be more careful with the paths we add to the spanner. Roughly speaking,
for every potential distance $d$ and index $j$, we consider nodes that have at least $2^j$ nodes at distance $d$ from
them. In this case, we can tolerate adding paths with $O(2^j)$ edges.

For a path $P$, let $|P|$ be the number of edges in $P$ and let $\mathrm{dist}(P)$ be the length of $P$. Let $\mathrm{dist}(v, u, x', H)$
be the minimal length of a path from $u$ to $v$ in $H$ among all paths with at most $x'$ edges. Let $P(u, v, x', H)$
be the shortest path in $H$ between $u$ and $v$ among all paths with at most $x'$ edges. We say that a node $v$ is
$(x', d)$-relevant in $H$ if $x' \le |B(v, d, H)|$. We say that a path $P$ is $(x', d)$-relevant if $x' \le |E(P)| \le 2x'$ and
$\mathrm{dist}(P) \le d$.
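
The hop-bounded distance $\mathrm{dist}(v, u, x', H)$ can be computed by running Bellman-Ford for exactly $x'$ rounds. The following is a minimal sketch under that assumption; the function names and the edge-list representation are illustrative and do not come from the paper.

```python
import math

def hop_bounded_dist(edges, source, max_hops):
    """Minimal length of a path from `source` to each node among paths
    with at most `max_hops` edges. `edges` holds undirected (u, v, w)."""
    dist = {source: 0.0}
    for _ in range(max_hops):
        # Relax only from the previous round, so round r uses <= r edges.
        nxt = dict(dist)
        for u, v, w in edges:
            for a, b in ((u, v), (v, u)):
                if a in dist and dist[a] + w < nxt.get(b, math.inf):
                    nxt[b] = dist[a] + w
        dist = nxt
    return dist

def is_relevant_path(num_edges, length, x, d):
    """(x, d)-relevance of a path: x <= |E(P)| <= 2x and dist(P) <= d."""
    return x <= num_edges <= 2 * x and length <= d
```

Copying the distance table at the start of each round is what enforces the hop bound; relaxing in place could let a single round chain several edges together.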

As in the unweighted case, we first describe an algorithm named WVL_Cover that, given a distance $d$,
an integer $x$ and a graph $G$, returns a subgraph $H_{d,x}$ that satisfies the following. For every node $v$ and label
$\lambda \in L$ such that there exists an $(x, d)$-relevant path $P$ from $v$ to a $\lambda$-labeled node, $\mathrm{dist}(v, \lambda, H_{d,x}) \le (4k + 1)d$.

The algorithm proceeds as follows. Initially, set $G' \leftarrow G$, $H_{d,x} \leftarrow (V, \emptyset)$, and $C \leftarrow \emptyset$. There are two
stages. The first stage of the algorithm is as follows. As long as there exists an $(x, d)$-relevant node $v$ in
$G'$ such that $|B(v, kd, G')| < x \cdot \ell^{(k-1)/k}$, pick $v$ to be such a node. Let $i$ be the minimal index such that
$|B(v, id, G')| < x \cdot \ell^{(i-1)/k}$. Construct a shortest-path tree $T(v)$ rooted at $v$ and spanning $B(v, id, G')$, and
then add the edges of $T(v)$ to $H_{d,x}$. Finally, remove the nodes $B(v, (i-1)d, G')$ from $G'$.
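
The choice of the index $i$ in the first stage can be sketched as follows. This is an illustrative helper, not the paper's code: it assumes the ball sizes $|B(v, id, G')|$ for $i = 1, \dots, k$ have already been computed, and the name `minimal_index` and its parameters are hypothetical.

```python
def minimal_index(ball_sizes, x, num_labels, k):
    """ball_sizes[i - 1] holds |B(v, i*d, G')| for i = 1..k.
    Returns the minimal i with |B(v, i*d, G')| < x * num_labels**((i-1)/k),
    or None when no index in 1..k qualifies."""
    for i in range(1, k + 1):
        if ball_sizes[i - 1] < x * num_labels ** ((i - 1) / k):
            return i
    return None
```

When $v$ is $(x, d)$-relevant, $|B(v, d, G')| \ge x$, so the index returned is always greater than 1 and the removed ball $B(v, (i-1)d, G')$ is non-trivial.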

The second stage of the algorithm is done as follows. As long as there exists an $(x, d)$-relevant node
$v$ in $G'$ such that $B(v, 2k \cdot d, G') \cap C = \emptyset$, add $v$ to $C$. For every node $c \in C$ do the following. First,
let $B(c) = \{u \in V(G') \mid \mathrm{dist}(u, c) = \mathrm{dist}(u, C)\}$ (recall that we assume unique shortest paths). Second,
construct a shortest-path tree $T(c)$ rooted at $c$ and spanning $B(c)$, and then add the edges of $T(c)$ to $H_{d,x}$.
Finally, for every label $\lambda \in L$ such that $\exists y \in B(c)$ with $\mathrm{dist}(y, \lambda, 2x) \le d$, add $E(P(y, \lambda, 2x, G))$ to $H_{d,x}$.
This completes the construction of our spanner. See Figure 2 for the formal code.



Procedure WVL_Cover(G, d, x)

$G' \leftarrow G$, $H_{d,x} \leftarrow (V, \emptyset)$, $C \leftarrow \emptyset$
**** Stage 1 ****

while $\exists$ $(x, d)$-relevant node $v \in V(G')$ such that $|B(v, kd, G')| < x \cdot \ell^{(k-1)/k}$ do:
    let $i$ be the minimal index such that $|B(v, id, G')| < x \cdot \ell^{(i-1)/k}$
    construct a shortest-path tree $T(v)$ rooted at $v$ and spanning $B(v, id, G')$
    add the edges of $T(v)$ to $H_{d,x}$
    remove the nodes $B(v, (i-1)d, G')$ from $G'$

**** Stage 2 ****

while $\exists$ $(x, d)$-relevant $v \in V(G')$ such that $B(v, 2k \cdot d, G') \cap C = \emptyset$ do:
    $C \leftarrow C \cup \{v\}$
for every node $c \in C$ do:
    let $B(c) = \{u \in V(G') \mid \mathrm{dist}(u, c) = \mathrm{dist}(u, C)\}$
    construct a shortest-path tree $T(c)$ rooted at $c$ and spanning $B(c)$
    add the edges of $T(c)$ to $H_{d,x}$
    for every label $\lambda \in L$ such that $\exists y \in B(c)$ with $\mathrm{dist}(y, \lambda, 2x) \le d$ do:
        pick $y$ to be such a node
        add $E(P(y, \lambda, 2x, G))$ to $H_{d,x}$
return $H_{d,x}$



Fig. 2. Constructing vertex-labeled spanners for weighted graphs 
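
The center selection and cell assignment of Stage 2 can be sketched in isolation. The following is an illustration under simplifying assumptions, not the paper's implementation: `dist` is a hypothetical callable returning distances in $G'$ after Stage 1, `candidates` is the list of $(x, d)$-relevant nodes, and `radius` stands for $2k \cdot d$.

```python
def pick_centers(candidates, dist, radius):
    """Greedily add a candidate to C when no existing center lies
    within `radius` of it (i.e. its ball of that radius misses C)."""
    centers = []
    for v in candidates:
        if all(dist(v, c) > radius for c in centers):
            centers.append(v)
    return centers

def assign_cells(nodes, centers, dist):
    """B(c): every node goes to its closest center. Ties go to the
    earlier center; the paper assumes unique shortest paths instead."""
    cells = {c: [] for c in centers}
    for u in nodes:
        closest = min(centers, key=lambda c: dist(u, c))
        cells[closest].append(u)
    return cells
```

On a path with nodes 0 to 9, unit weights and `radius = 4`, scanning the candidates in increasing order selects centers 0 and 5, and the cells split between nodes 2 and 3.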



Lemma 5. For every node $u \in V$ and label $\lambda \in L$ such that there exists an $(x, d)$-relevant path $P$ from $u$ to
a $\lambda$-labeled node, $\mathrm{dist}(u, \lambda, H_{d,x}) \le (4k + 1)d$.

Proof:

Consider a node $u \in V$ and a label $\lambda \in L$ such that there exists an $(x, d)$-relevant path $P$ from $u$ to a
$\lambda$-labeled node $u_\lambda$.

We consider two cases. The first case is when some node $y \in V(P)$ is removed from $G'$ in the first stage
of the algorithm. The second case is when none of the nodes in $V(P)$ is removed from $G'$ in the first stage
of the algorithm.

Consider the first case. Let $y$ be the first node in $V(P)$ that is removed from $G'$ by the algorithm. Let
$v$ be the node that is picked by the "while" loop of the first stage of the algorithm such that
$B(v, (i-1)d, G'(v))$ is removed from $G'$ and $y \in B(v, (i-1)d, G'(v))$. Note that $V(P) \subseteq V(G'(v))$, since $y$ is
the first node in $V(P)$ that is removed from $G'$. Recall that a shortest-path tree $T(v)$, rooted at $v$ and
spanning $B(v, id, G'(v))$, is added to $H_{d,x}$. Note also that $V(P) \subseteq B(v, id, G'(v))$. To see this, recall that
$y \in B(v, (i-1)d, G'(v))$, namely, $\mathrm{dist}(v, y, G'(v)) \le (i-1)d$. In addition, $\mathrm{dist}(y, z, G'(v)) \le d$ for every
$z \in V(P)$. Hence, $\mathrm{dist}(v, z, G'(v)) \le id$ for every $z \in V(P)$. Specifically, $u, u_\lambda \in B(v, id, G'(v))$. We get that
$\mathrm{dist}(u, \lambda, H_{d,x}) \le \mathrm{dist}(u, \lambda, T(v)) \le \mathrm{dist}(u, v, T(v)) + \mathrm{dist}(v, u_\lambda, T(v)) \le 2i \cdot d \le 2k \cdot d$, as required.

Let $G''$ be the graph $G'$ at the end of the first phase.

Consider the second case. We consider two subcases: first, when $u \in C$, and second, when $u \notin C$.
Consider the first subcase. Note that $\mathrm{dist}(u, \lambda, 2x) \le d$ and clearly $u \in B(u)$. Therefore, a path
$P(y, \lambda, 2x, G)$ is added to $H_{d,x}$ for some $y \in B(u)$. Thus, $\mathrm{dist}(u, \lambda, H_{d,x}) \le \mathrm{dist}(u, y, H_{d,x}) + \mathrm{dist}(y, \lambda, H_{d,x}) \le
2kd + d = (2k + 1)d$.

Consider the second subcase, where the node $u \notin C$. Note that $V(P) \subseteq V(G'')$, since none of the nodes
in $V(P)$ is removed from $G'$ by the algorithm. In addition, $V(P) \subseteq B(u, d, G'')$. Therefore, $|B(u, d, G'')| \ge
|V(P)| \ge x$. Hence, by definition, $u$ is $(x, d)$-relevant in $G''$. The node $u$ is not added to $C$ by the algorithm;
this could happen only if there is another node $w \in C$ such that $\mathrm{dist}(w, u, G'') \le 2k \cdot d$. Let $c \in V(G'')$ be
the node such that $u \in B(c)$. Note that $\mathrm{dist}(c, u, G) \le \mathrm{dist}(c, u, G'') \le 2k \cdot d$. Moreover, $\mathrm{dist}(u, \lambda, 2x, G) \le
d$. Therefore, a path $P(y, \lambda, 2x, G)$ is added to $H_{d,x}$ for some $y \in B(c)$. We get that $\mathrm{dist}(u, \lambda, H_{d,x}) \le
\mathrm{dist}(u, c, H_{d,x}) + \mathrm{dist}(c, y, H_{d,x}) + \mathrm{dist}(y, \lambda, H_{d,x}) \le 2kd + 2kd + d = (4k + 1)d$. ∎

Lemma 6. The number of edges in $H_{d,x}$ is $O(n\ell^{1/k})$.

Proof: The algorithm adds edges in three different locations. The first location is in the first stage of the
algorithm. The second location is edges in the shortest-path trees spanning $B(c)$ for every $c \in C$. The third
location is edges on paths $P(y, \lambda, 2x, G)$ for some node $c \in C$, $\lambda \in L$, and $y \in B(c)$. We now show that the
number of edges added in each of the three locations is $O(n\ell^{1/k})$.

Consider the first location. Let $v \in V$ be a node that is picked in the first stage of Algorithm WVL_Cover
and let $1 \le i \le k$ be the minimal index such that $|B(v, id, G'(v))| < x \cdot \ell^{(i-1)/k}$. A shortest-path tree $T(v)$
rooted at $v$ and spanning $B(v, id, G'(v))$ is added to $H_{d,x}$. Namely, $|B(v, id, G'(v))| \le O(x \cdot \ell^{(i-1)/k})$ edges
are added to $H_{d,x}$. Notice that $i > 1$. This is due to the fact that $v$ is $(x, d)$-relevant in $G'(v)$. By the
minimality of the index $i$, we have $|B(v, (i-1) \cdot d, G'(v))| \ge x \cdot \ell^{(i-2)/k}$. We thus can charge the nodes in
$B(v, (i-1) \cdot d, G'(v))$ with the edges in $T(v)$ added to $H_{d,x}$. Note that every node in $B(v, (i-1) \cdot d, G'(v))$ is
charged with at most $O(\ell^{1/k})$ edges. Moreover, the nodes in $B(v, (i-1) \cdot d, G'(v))$ are removed from $G'$, and
thus no node is charged twice. We thus conclude that the number of edges added for this type is $O(n\ell^{1/k})$.

Consider the second location. Note that every node belongs to exactly one set $B(c)$ for some $c \in C$. We
thus add a single edge for every node $v \in V$ in this stage. We obtain that $O(n)$ edges are added for the
second location.

Finally, consider the third location. Let $v \in C$. Since $v$ is $(x, d)$-relevant in $G''$ and since $v$ is not removed
in the first stage of the algorithm, $|B(v, k \cdot d, G'')| \ge x \cdot \ell^{(k-1)/k}$. The number of edges added for a path
$P(y, \lambda, 2x, G)$ for some $\lambda \in L$ and $y \in B(v)$ is at most $2 \cdot x$. Since there are at most $\ell$ labels, at most $2\ell \cdot x$
edges are added for the node $v$. We charge the nodes in $B(v, k \cdot d, G'')$ with these edges, charging each node
in $B(v, k \cdot d, G'')$ with at most $O(\ell^{1/k})$ edges. Note that since the algorithm adds the node $v$ to $C$, no node $u$
at distance less than $k \cdot d$ from some node $y' \in B(v, k \cdot d, G'')$ will be picked by the algorithm at some later
step. Thus every node is charged only once. We can thus conclude that the number of edges added for the
third location is $O(n\ell^{1/k})$.

The lemma follows. ∎

The main algorithm for constructing our spanner operates in $\log n \cdot \log D$ iterations, where $D$ is the
diameter of the graph. For a given fixed parameter $\epsilon$, for every index $1 \le i \le \log D$ and $1 \le j \le \log n$, invoke
Algorithm WVL_Cover with parameters $d(i) = (1 + \epsilon)^i$ and $x(j) = 2^j$. Let $H_{d(i),x(j)}$ be the subgraph
returned by Algorithm WVL_Cover. Finally, let $H$ be the union of all subgraphs $H_{d(i),x(j)}$ for $1 \le i \le
\log D$ and $1 \le j \le \log n$. This completes our spanner construction.
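
The size of $H$ follows by summing the per-invocation bound over all pairs of scales. As a sketch, assuming each invocation of WVL_Cover returns a subgraph with $O(n\ell^{1/k})$ edges:

```latex
|E(H)| \le \sum_{i=1}^{\log D} \sum_{j=1}^{\log n} |E(H_{d(i),x(j)})|
       = O\left(\log n \cdot \log D \cdot n\ell^{1/k}\right).
```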

The following lemma shows that the stretch of the spanner is $(4k + 1)(1 + \epsilon)$.

Lemma 7. For every node $u \in V$ and label $\lambda \in L$, $\mathrm{dist}(u, \lambda, H) \le (4k + 1)(1 + \epsilon)\,\mathrm{dist}(u, \lambda, G)$.

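Proof: Consider a node $u \in V$ and a label $\lambda \in L$. Let $P = P(u, \lambda)$ be the shortest path from $u$ to $\lambda$ in $G$.
Let $i$ and $j$ be the indices such that $(1 + \epsilon)^{i-1} \le \mathrm{dist}(P) \le (1 + \epsilon)^i$ and $2^j \le |P| \le 2^{j+1}$. By Lemma 5,
$\mathrm{dist}(u, \lambda, H_{d,x}) \le (4k + 1)(1 + \epsilon)^i$, for $d = (1 + \epsilon)^i$ and $x = 2^j$. Since $(1 + \epsilon)^{i-1} \le \mathrm{dist}(P)$, this is at
most $(4k + 1)(1 + \epsilon)\,\mathrm{dist}(u, \lambda, G)$. The lemma follows. ∎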
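
The choice of scales in this proof can be illustrated with a small helper. This is a sketch under stated assumptions, not the paper's code: the name `scale_indices` is hypothetical, path lengths are assumed to be at least 1, and indices start at 0 rather than 1 so that single-edge paths are covered.

```python
def scale_indices(length, hops, eps):
    """Return (i, j): i is the smallest index with length <= (1+eps)**i,
    and j is the largest index with 2**j <= hops (requires hops >= 1)."""
    i, d = 0, 1.0
    while d < length:
        d *= 1 + eps
        i += 1
    j = hops.bit_length() - 1
    return i, j
```

For a path of length 10 with 5 edges and $\epsilon = 1$, the helper returns $i = 4$ and $j = 2$, so the path is $(x, d)$-relevant for $d = 16$ and $x = 4$, and $d$ overshoots the true length by a factor of at most $1 + \epsilon$.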

We thus conclude Theorem 3.